Sparse PCA with Oracle Property

Authors

  • Quanquan Gu
  • Zhaoran Wang
  • Han Liu
Abstract

In this paper, we study the estimation of the k-dimensional sparse principal subspace of the covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank k, and attains a √(s/n) statistical rate of convergence, with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator, which enjoys the oracle property, we prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results with numerical experiments on synthetic datasets.
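To make the starting point concrete, below is a minimal sketch (in Python with NumPy and CVXPY, assuming a sample covariance matrix S) of the standard semidefinite relaxation of sparse PCA, i.e., Fantope projection with an entrywise l1 penalty. The function name and the tuning parameter lam are illustrative; the paper's novel regularizations are not reproduced here.

import numpy as np
import cvxpy as cp

def sdp_sparse_pca(S, k, lam):
    # Illustrative sketch of the standard SDP relaxation of sparse PCA:
    #   maximize  <S, X> - lam * sum_ij |X_ij|
    #   subject to X in the Fantope {X : 0 <= X <= I, trace(X) = k}.
    d = S.shape[0]
    X = cp.Variable((d, d), symmetric=True)
    objective = cp.Maximize(cp.trace(S @ X) - lam * cp.sum(cp.abs(X)))
    constraints = [X >> 0, np.eye(d) - X >> 0, cp.trace(X) == k]
    cp.Problem(objective, constraints).solve()
    # The subspace estimate is spanned by the top-k eigenvectors of the solution.
    _, eigvecs = np.linalg.eigh(X.value)
    return eigvecs[:, -k:]

The returned columns span the estimated principal subspace, and U @ U.T gives the estimated projection matrix. The paper's estimators replace the plain l1 penalty with novel regularizers so that, under the stated assumptions, the solution exactly matches the oracle estimator and has rank exactly k.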

Similar Articles

Principal Component Analysis in Very High-dimensional Spaces

Principal component analysis (PCA) is widely used for dimension reduction in high-dimensional data analysis. A main disadvantage of standard PCA is that the principal components are typically linear combinations of all variables, which makes the results difficult to interpret. Standard PCA also fails to yield consistent estimators of the loading vectors in very high-...
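As a quick toy illustration (a hypothetical NumPy example, not from the article), the leading loading vector of standard PCA is typically dense, assigning nonzero weight to every variable:

import numpy as np

# Hypothetical data for illustration: mix 10 variables so the leading
# eigenvector of the sample covariance involves all of them.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 10))
cov = np.cov(X, rowvar=False)
_, eigvecs = np.linalg.eigh(cov)
leading = eigvecs[:, -1]                          # loading vector of the first PC
print(np.count_nonzero(np.abs(leading) > 1e-8))   # typically prints 10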

Consistency of sparse PCA in High Dimension, Low Sample Size contexts

Sparse Principal Component Analysis (PCA) methods are efficient tools to reduce the dimension (or number of variables) of complex data. Sparse principal components (PCs) are easier to interpret than conventional PCs, because most loadings are zero. We study the asymptotic properties of these sparse PC directions for scenarios with fixed sample size and increasing dimension (i.e. High Dimension,...

MATLAB User Guide for Depth Reconstruction from Sparse Samples

I. Reconstruction functions (each with demonstration code):
  1. xout = ADMM_WT(S, b, param) (demo: Demo_ADMM_WT.m)
  2. xout = ADMM_WT_CT(S, b, param) (demo: Demo_ADMM_WT_CT.m)
  3. xout = ADMM_outer(S, b) (demo: Demo_Multiscale_ADMM_WT_CT.m)
II. Sampling functions (each with demonstration code):
  1. S = Oracle_Random_Sampling(x0, sp) (demo: Demo_Oracle_Random_Sampling.m)
  2. S = Oracle_Random_Sampling_with_PCA(x0, sp, Spilot) (demo: Demo_Oracle_Random_Sampling_with_...

Strong Oracle Optimality of Folded Concave Penalized Estimation

Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions, and the oracle property is established only for one of these unknown local solutions. A challenging fundamental issue remains: it is not clear whether the local optimum computed...
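For reference, one canonical folded concave penalty is SCAD (Fan and Li, 2001); below is a minimal NumPy sketch of its penalty function, with tuning parameters lam and a (a = 3.7 is a common default):

import numpy as np

def scad_penalty(t, lam, a=3.7):
    # Sketch of the SCAD penalty: linear up to lam, a quadratic transition on
    # (lam, a*lam], and constant beyond a*lam; the flattening tail is what
    # makes the penalty "folded concave" and reduces bias on large coefficients.
    t = np.abs(np.asarray(t, dtype=float))
    linear = lam * t
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))
    constant = lam**2 * (a + 1) / 2
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, constant))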

Journal:
  • Advances in neural information processing systems

Volume: 2014

Publication date: 2014